Search | VHL Regional Portal

1.

Genomic data resources of the Brain Somatic Mosaicism Network for neuropsychiatric diseases.

Garrison, McKinzie A; Jang, Yeongjun; Bae, Taejeong; Cherskov, Adriana; Emery, Sarah B; Fasching, Liana; Jones, Attila; Moldovan, John B; Molitor, Cindy; Pochareddy, Sirisha; Peters, Mette A; Shin, Joo Heon; Wang, Yifan; Yang, Xiaoxu; Akbarian, Schahram; Chess, Andrew; Gage, Fred H; Gleeson, Joseph G; Kidd, Jeffrey M; McConnell, Michael; Mills, Ryan E; Moran, John V; Park, Peter J; Sestan, Nenad; Urban, Alexander E; Vaccarino, Flora M; Walsh, Christopher A; Weinberger, Daniel R; Wheelan, Sarah J; Abyzov, Alexej.

Sci Data ; 10(1): 813, 2023 11 20.

Article in English | MEDLINE | ID: mdl-37985666

ABSTRACT

Somatic mosaicism is defined as an occurrence of two or more populations of cells having genomic sequences differing at given loci in an individual who is derived from a single zygote. It is a characteristic of multicellular organisms that plays a crucial role in normal development and disease. To study the nature and extent of somatic mosaicism in autism spectrum disorder, bipolar disorder, focal cortical dysplasia, schizophrenia, and Tourette syndrome, a multi-institutional consortium called the Brain Somatic Mosaicism Network (BSMN) was formed through the National Institute of Mental Health (NIMH). In addition to genomic data of affected and neurotypical brains, the BSMN also developed and validated a best practices somatic single nucleotide variant calling workflow through the analysis of reference brain tissue. These resources, which include >400 terabytes of data from 1087 subjects, are now available to the research community via the NIMH Data Archive (NDA) and are described here.

Subjects

Mental Disorders , Humans , Autism Spectrum Disorder/genetics , Brain , Genomics , Mosaicism , Genome, Human , Mental Disorders/genetics

2.

Genome-wide methylation patterns from canine nanopore assemblies.

Schall, Peter Z; Winkler, Paige A; Petersen-Jones, Simon M; Yuzbasiyan-Gurkan, Vilma; Kidd, Jeffrey M.

G3 (Bethesda) ; 13(11)2023 11 01.

Article in English | MEDLINE | ID: mdl-37681359

ABSTRACT

Recent advances in long-read sequencing have enabled the creation of reference-quality genome assemblies for multiple individuals within a species. In particular, 8 long-read genome assemblies have recently been published for the canine model (dogs and wolves). These assemblies were created using a range of sequencing and computational approaches, with only limited comparisons described among subsets of the assemblies. Here we present 3 high-quality de novo reference assemblies based upon Oxford Nanopore long-read sequencing: 2 Bernese Mountain Dogs (BD & OD) and a Cairn terrier (CA611). These breeds are of particular interest due to the enrichment of unresolved genetic disorders. Leveraging advances in software technologies, we utilized published data of Labrador Retriever (Yella) to generate a new assembly, resulting in a ~280-fold increase in continuity (N50 size of 91 kbp vs 25.75 Mbp). In conjunction with these 4 new assemblies, we uniformly assessed 8 existing assemblies for generalized quality metrics, sequence divergence, and a detailed BUSCO assessment. We identified a set of ~400 conserved genes during the BUSCO analysis missing in all assemblies. Genome-wide methylation profiles were generated from the nanopore sequencing, resulting in broad concordance with existing whole-genome and reduced-representation bisulfite sequencing, while highlighting superior coverage of mobile elements. These analyses demonstrate the ability of Nanopore sequencing to resolve the sequence and epigenetic profile of canine genomes.
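The continuity figure quoted in this abstract rests on the N50 statistic. As a rough illustration only, the Python sketch below computes N50 for a list of contig lengths and checks the fold change between the two N50 values given in the abstract. The function name `n50` and the toy contig lengths are assumptions made for this example and do not come from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths supplied")


# Toy contig lengths (hypothetical, not taken from any assembly).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)

# Fold change between the two N50 values quoted in the abstract.
old_n50_bp = 91_000        # 91 kbp
new_n50_bp = 25_750_000    # 25.75 Mbp
fold_increase = new_n50_bp / old_n50_bp  # roughly 283, i.e. "~280-fold"
```

Sorting once and accumulating keeps the calculation linear after the sort, which is sufficient even for assemblies with many thousands of contigs.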

Subjects

Nanopores , Dogs , Animals , Methylation , Genome , Sequence Analysis, DNA , Software , High-Throughput Nucleotide Sequencing

3.

Genome sequencing of 2000 canids by the Dog10K consortium advances the understanding of demography, genome function and architecture.

Meadows, Jennifer R S; Kidd, Jeffrey M; Wang, Guo-Dong; Parker, Heidi G; Schall, Peter Z; Bianchi, Matteo; Christmas, Matthew J; Bougiouri, Katia; Buckley, Reuben M; Hitte, Christophe; Nguyen, Anthony K; Wang, Chao; Jagannathan, Vidhya; Niskanen, Julia E; Frantz, Laurent A F; Arumilli, Meharji; Hundi, Sruthi; Lindblad-Toh, Kerstin; Ginja, Catarina; Agustina, Kadek Karang; André, Catherine; Boyko, Adam R; Davis, Brian W; Drögemüller, Michaela; Feng, Xin-Yao; Gkagkavouzis, Konstantinos; Iliopoulos, Giorgos; Harris, Alexander C; Hytönen, Marjo K; Kalthoff, Daniela C; Liu, Yan-Hu; Lymberakis, Petros; Poulakakis, Nikolaos; Pires, Ana Elisabete; Racimo, Fernando; Ramos-Almodovar, Fabian; Savolainen, Peter; Venetsani, Semina; Tammen, Imke; Triantafyllidis, Alexandros; vonHoldt, Bridgett; Wayne, Robert K; Larson, Greger; Nicholas, Frank W; Lohi, Hannes; Leeb, Tosso; Zhang, Ya-Ping; Ostrander, Elaine A.

Genome Biol ; 24(1): 187, 2023 08 15.

Article in English | MEDLINE | ID: mdl-37582787

ABSTRACT

BACKGROUND: The international Dog10K project aims to sequence and analyze several thousand canine genomes. Incorporating 20× data from 1987 individuals, including 1611 dogs (321 breeds), 309 village dogs, 63 wolves, and four coyotes, we identify genomic variation across the canid family, setting the stage for detailed studies of domestication, behavior, morphology, disease susceptibility, and genome architecture and function. RESULTS: We report the analysis of >48 M single-nucleotide, indel, and structural variants spanning the autosomes, X chromosome, and mitochondria. We discover more than 75% of variation for 239 sampled breeds. Allele sharing analysis indicates that 94.9% of breeds form monophyletic clusters and 25 major clades. German Shepherd Dogs and related breeds show the highest allele sharing with independent breeds from multiple clades. On average, each breed dog differs from the UU_Cfam_GSD_1.0 reference at 26,960 deletions and 14,034 insertions greater than 50 bp, with wolves having 14% more variants. Discovered variants include retrogene insertions from 926 parent genes. To aid functional prioritization, single-nucleotide variants were annotated with SnpEff and Zoonomia phyloP constraint scores. Constrained positions were negatively correlated with allele frequency. Finally, the utility of the Dog10K data as an imputation reference panel is assessed, generating high-confidence calls across varied genotyping platform densities including for breeds not included in the Dog10K collection. CONCLUSIONS: We have developed a dense dataset of 1987 sequenced canids that reveals patterns of allele sharing, identifies likely functional variants, informs breed structure, and enables accurate imputation. Dog10K data are publicly available.

Subjects

Wolves , Dogs , Animals , Wolves/genetics , Chromosome Mapping , Alleles , Polymorphism, Single Nucleotide , Nucleotides , Demography

4.

Mapping the Complex Genetic Landscape of Human Neurons.

Sun, Chen; Kathuria, Kunal; Emery, Sarah B; Kim, ByungJun; Burbulis, Ian E; Shin, Joo Heon; Weinberger, Daniel R; Moran, John V; Kidd, Jeffrey M; Mills, Ryan E; McConnell, Michael J.

bioRxiv ; 2023 Mar 07.

Article in English | MEDLINE | ID: mdl-36945473

ABSTRACT

When somatic cells acquire complex karyotypes, they are removed by the immune system. Mutant somatic cells that evade immune surveillance can lead to cancer. Neurons with complex karyotypes arise during neurotypical brain development, but neurons are almost never the origin of brain cancers. Instead, somatic mutations in neurons can bring about neurodevelopmental disorders, and contribute to the polygenic landscape of neuropsychiatric and neurodegenerative disease. A subset of human neurons harbors idiosyncratic copy number variants (CNVs, "CNV neurons"), but previous analyses of CNV neurons have been limited by relatively small sample sizes. Here, we developed an allele-based validation approach, SCOVAL, to corroborate or reject read-depth based CNV calls in single human neurons. We applied this approach to 2,125 frontal cortical neurons from a neurotypical human brain. This approach identified 226 CNV neurons, as well as a class of CNV neurons with complex karyotypes containing whole or substantial losses on multiple chromosomes. Moreover, we found that CNV location appears to be nonrandom. Recurrent regions of neuronal genome rearrangement contained fewer, but longer, genes.

5.

Recent, full-length gene retrocopies are common in canids.

Batcher, Kevin; Varney, Scarlett; York, Daniel; Blacksmith, Matthew; Kidd, Jeffrey M; Rebhun, Robert; Dickinson, Peter; Bannasch, Danika.

Genome Res ; 2022 Aug 12.

Article in English | MEDLINE | ID: mdl-35961775

ABSTRACT

Gene retrocopies arise from the reverse transcription and insertion into the genome of processed mRNA transcripts. Although many retrocopies have acquired mutations that render them functionally inactive, most mammals retain active LINE-1 sequences capable of producing new retrocopies. New retrocopies, referred to as retro copy number variants (retroCNVs), may not be identified by standard variant calling techniques in high-throughput sequencing data. Although multiple functional FGF4 retroCNVs have been associated with skeletal dysplasias in dogs, the full landscape of canid retroCNVs has not been characterized. Here, retroCNV discovery was performed on a whole-genome sequencing data set of 293 canids from 76 breeds. We identified retroCNV parent genes via the presence of mRNA-specific 30-mers, and then identified retroCNV insertion sites through discordant read analysis. In total, we resolved insertion sites for 1911 retroCNVs from 1179 parent genes, 1236 of which appeared identical to their parent genes. Dogs had on average 54.1 total retroCNVs and 1.4 private retroCNVs. We found evidence of expression in testes for 12% (14/113) of the retroCNVs identified in six Golden Retrievers, including four chimeric transcripts, and 97 retroCNVs also had significantly elevated F_ST across dog breeds, possibly indicating selection. We applied our approach to a subset of human genomes and detected an average of 4.2 retroCNVs per sample, highlighting a 13-fold relative increase of retroCNV frequency in dogs. Particularly in canids, retroCNVs are a largely unexplored source of genetic variation that can contribute to genome plasticity and that should be considered when investigating traits and diseases.
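The idea of "mRNA-specific" k-mers in this abstract can be illustrated with a small Python sketch: k-mers that occur in the spliced transcript but not in the intron-containing parent locus must span an exon-exon junction, so a genomic read carrying one points to a retrocopy. This is a toy under stated assumptions, not the authors' pipeline: the function names and sequences are hypothetical, k is 4 rather than 30, and reverse-complement matching is left out.

```python
def kmers(seq, k):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def mrna_specific_kmers(exons, genomic_locus, k):
    """k-mers present in the spliced transcript but absent from the
    intron-containing parent locus; these span exon-exon junctions."""
    spliced = "".join(exons)
    return kmers(spliced, k) - kmers(genomic_locus, k)


def read_supports_retrocopy(read, junction_kmers, k):
    """True if the read contains at least one mRNA-specific k-mer.
    Reverse-complement matches are ignored to keep the sketch short."""
    return not kmers(read, k).isdisjoint(junction_kmers)


# Toy parent gene (hypothetical sequences), with k=4 instead of 30.
exon1, intron, exon2 = "ACGTAC", "GTTTTTAG", "CATGCA"
junction = mrna_specific_kmers([exon1, exon2], exon1 + intron + exon2, k=4)

spliced_read = "GTACCATG"    # crosses the exon1-exon2 junction
unspliced_read = "GTACGTTT"  # runs from exon1 into the intron

# Ratio of the per-sample averages quoted in the abstract (dog vs human).
dog_vs_human = 54.1 / 4.2
```

The ratio of the two averages comes out just under 13, consistent with the "13-fold" figure in the abstract.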

6.

Author Correction: Comparative and demographic analysis of orang-utan genomes.

Locke, Devin P; Hillier, LaDeana W; Warren, Wesley C; Worley, Kim C; Nazareth, Lynne V; Muzny, Donna M; Yang, Shiaw-Pyng; Wang, Zhengyuan; Chinwalla, Asif T; Minx, Pat; Mitreva, Makedonka; Cook, Lisa; Delehaunty, Kim D; Fronick, Catrina; Schmidt, Heather; Fulton, Lucinda A; Fulton, Robert S; Nelson, Joanne O; Magrini, Vincent; Pohl, Craig; Graves, Tina A; Markovic, Chris; Cree, Andy; Dinh, Huyen H; Hume, Jennifer; Kovar, Christie L; Fowler, Gerald R; Lunter, Gerton; Meader, Stephen; Heger, Andreas; Ponting, Chris P; Marques-Bonet, Tomas; Alkan, Can; Chen, Lin; Cheng, Ze; Kidd, Jeffrey M; Eichler, Evan E; White, Simon; Searle, Stephen; Vilella, Albert J; Chen, Yuan; Flicek, Paul; Ma, Jian; Raney, Brian; Suh, Bernard; Burhans, Richard; Herrero, Javier; Haussler, David; Faria, Rui; Fernando, Olga.

Nature ; 608(7924): E36, 2022 Aug.

Article in English | MEDLINE | ID: mdl-35962045

7.

Bimodal Expression Patterns, and Not Viral Burst Sizes, Predict the Effects of Vpr on HIV-1 Proviral Populations in Jurkat Cells.

Atindaana, Edmond; Kissi-Twum, Abena; Emery, Sarah; Burnett, Cleo; Pitcher, Jake; Visser, Myra; Kidd, Jeffrey M; Telesnitsky, Alice.

mBio ; 13(2): e0374821, 2022 04 26.

Article in English | MEDLINE | ID: mdl-35384697

ABSTRACT

Integration site landscapes, clonal dynamics, and latency reversal with or without vpr were compared in HIV-1-infected Jurkat cell populations, and the properties of individual clones were defined. Clones differed in fractions of long terminal repeat (LTR)-active daughter cells, with some clones containing few to no LTR-active cells, while almost all cells were LTR active for others. Clones varied over 4 orders of magnitude in virus release per active cell. Proviruses in largely LTR-active clones were closer to preexisting enhancers and promoters than those in low-LTR-active clones. Unsurprisingly, major vpr+ clones contained fewer LTR-active cells than vpr- clones, and predominant vpr+ proviruses were farther from enhancers and promoters than those in vpr- pools. Distances to these marks among intact proviruses previously reported for antiretroviral therapy (ART)-suppressed patients revealed that patient integration sites were more similar to those in the vpr+ pool than to vpr- integrants. Complementing vpr-defective proviruses with vpr led to the rapid loss of highly LTR-active clones, indicating that the effect of Vpr on proviral populations occurred after integration. However, major clones in the complemented pool and its vpr- parent population did not differ in burst sizes. When the latency reactivation agents prostratin and JQ1 were applied separately or in combination, vpr+ and vpr- population-wide trends were similar, with dual-treatment enhancement being due in part to reactivated clones that did not respond to either drug applied separately. However, the expression signatures of individual clones differed between populations. These observations highlight how Vpr, exerting selective pressure on proviral epigenetic variation, can shape integration site landscapes, proviral expression patterns, and reactivation properties. IMPORTANCE: A bedrock assumption in HIV-1 population modeling is that all active cells release the same amount of virus. However, the findings here revealed that when HIV-infected cells expand into clones, each clone differs in virus production. Reasoning that this variation in expression patterns constituted a population of clones from which differing subsets would prevail under differing environmental conditions, the cytotoxic HIV-1 protein Vpr was introduced, and population dynamics and expression properties were compared in the presence and absence of Vpr. The results showed that whereas most clones produced fairly continuous levels of virus in the absence of Vpr, its presence selected for a distinct subset of clones with properties reminiscent of persistent populations in patients, suggesting the possibility that the interclonal variation in expression patterns observed in culture may contribute to proviral persistence in vivo.

Subjects

HIV Seropositivity , HIV-1 , HIV-1/physiology , Humans , Jurkat Cells , Proviruses/genetics , vpr Gene Products, Human Immunodeficiency Virus/genetics , vpr Gene Products, Human Immunodeficiency Virus/metabolism

8.

Canis familiaris (Great Dane domestic dog).

Halo, Julia V; Kidd, Jeffrey M.

Trends Genet ; 38(5): 514-515, 2022 05.

Article in English | MEDLINE | ID: mdl-35232612

Subjects

Dog Diseases , Animals , Dogs

9.

Dog10K_Boxer_Tasha_1.0: A Long-Read Assembly of the Dog Reference Genome.

Jagannathan, Vidhya; Hitte, Christophe; Kidd, Jeffrey M; Masterson, Patrick; Murphy, Terence D; Emery, Sarah; Davis, Brian; Buckley, Reuben M; Liu, Yan-Hu; Zhang, Xiang-Quan; Leeb, Tosso; Zhang, Ya-Ping; Ostrander, Elaine A; Wang, Guo-Dong.

Genes (Basel) ; 12(6)2021 05 30.

Article in English | MEDLINE | ID: mdl-34070911

ABSTRACT

The domestic dog has evolved to be an important biomedical model for studies regarding the genetic basis of disease, morphology and behavior. Genetic studies in the dog have relied on a draft reference genome of a purebred female boxer dog named "Tasha" initially published in 2005. Derived from a Sanger whole genome shotgun sequencing approach coupled with limited clone-based sequencing, the initial assembly and subsequent updates have served as the predominant resource for canine genetics for 15 years. While the initial assembly produced a good-quality draft, as with all assemblies produced at the time, it contained gaps, assembly errors and missing sequences, particularly in GC-rich regions, which are found at many promoters and in the first exons of protein-coding genes. Here, we present Dog10K_Boxer_Tasha_1.0, an improved chromosome-level highly contiguous genome assembly of Tasha created with long-read technologies that increases sequence contiguity >100-fold, closes >23,000 gaps of the CanFam3.1 reference assembly and improves gene annotation by identifying >1200 new protein-coding transcripts. The assembly and annotation are available at NCBI under the accession GCF_000002285.5.

Subjects

Dogs/genetics , Genome , Animals , Contig Mapping , Molecular Sequence Annotation

10.

Long-read assembly of a Great Dane genome highlights the contribution of GC-rich sequence and mobile elements to canine genomes.

Halo, Julia V; Pendleton, Amanda L; Shen, Feichen; Doucet, Aurélien J; Derrien, Thomas; Hitte, Christophe; Kirby, Laura E; Myers, Bridget; Sliwerska, Elzbieta; Emery, Sarah; Moran, John V; Boyko, Adam R; Kidd, Jeffrey M.

Proc Natl Acad Sci U S A ; 118(11)2021 03 16.

Article in English | MEDLINE | ID: mdl-33836575

ABSTRACT

Technological advances have allowed improvements in genome reference sequence assemblies. Here, we combined long- and short-read sequence resources to assemble the genome of a female Great Dane dog. This assembly has improved continuity compared to the existing Boxer-derived (CanFam3.1) reference genome. Annotation of the Great Dane assembly identified 22,182 protein-coding gene models and 7,049 long noncoding RNAs, including 49 protein-coding genes not present in the CanFam3.1 reference. The Great Dane assembly spans the majority of sequence gaps in the CanFam3.1 reference and illustrates that 2,151 gaps overlap the transcription start site of a predicted protein-coding gene. Moreover, a subset of the resolved gaps, which have an 80.95% median GC content, localize to transcription start sites and recombination hotspots more often than expected by chance, suggesting the stable canine recombinational landscape has shaped genome architecture. Alignment of the Great Dane and CanFam3.1 assemblies identified 16,834 deletions and 15,621 insertions, as well as 2,665 deletions and 3,493 insertions located on secondary contigs. These structural variants are dominated by retrotransposon insertion/deletion polymorphisms and include 16,221 dimorphic canine short interspersed elements (SINECs) and 1,121 dimorphic long interspersed element-1 sequences (LINE-1_Cfs). Analysis of sequences flanking the 3' end of LINE-1_Cfs (i.e., LINE-1_Cf 3'-transductions) suggests multiple retrotransposition-competent LINE-1_Cfs segregate among dog populations. Consistent with this conclusion, we demonstrate that a canine LINE-1_Cf element with intact open reading frames can retrotranspose its own RNA and that of a SINEC_Cf consensus sequence in cultured human cells, implicating ongoing retrotransposon activity as a driver of canine genetic variation.

Subjects

Dogs/genetics , GC Rich Sequence , Genome , Interspersed Repetitive Sequences , Animals , Dogs/classification , Long Interspersed Nucleotide Elements , Short Interspersed Nucleotide Elements , Species Specificity

11.

Comprehensive identification of somatic nucleotide variants in human brain tissue.

Wang, Yifan; Bae, Taejeong; Thorpe, Jeremy; Sherman, Maxwell A; Jones, Attila G; Cho, Sean; Daily, Kenneth; Dou, Yanmei; Ganz, Javier; Galor, Alon; Lobon, Irene; Pattni, Reenal; Rosenbluh, Chaggai; Tomasi, Simone; Tomasini, Livia; Yang, Xiaoxu; Zhou, Bo; Akbarian, Schahram; Ball, Laurel L; Bizzotto, Sara; Emery, Sarah B; Doan, Ryan; Fasching, Liana; Jang, Yeongjun; Juan, David; Lizano, Esther; Luquette, Lovelace J; Moldovan, John B; Narurkar, Rujuta; Oetjens, Matthew T; Rodin, Rachel E; Sekar, Shobana; Shin, Joo Heon; Soriano, Eduardo; Straub, Richard E; Zhou, Weichen; Chess, Andrew; Gleeson, Joseph G; Marquès-Bonet, Tomas; Park, Peter J; Peters, Mette A; Pevsner, Jonathan; Walsh, Christopher A; Weinberger, Daniel R; Vaccarino, Flora M; Moran, John V; Urban, Alexander E; Kidd, Jeffrey M; Mills, Ryan E; Abyzov, Alexej.

Genome Biol ; 22(1): 92, 2021 03 29.

Article in English | MEDLINE | ID: mdl-33781308

ABSTRACT

BACKGROUND: Post-zygotic mutations incurred during DNA replication, DNA repair, and other cellular processes lead to somatic mosaicism. Somatic mosaicism is an established cause of various diseases, including cancers. However, detecting mosaic variants in DNA from non-cancerous somatic tissues poses significant challenges, particularly if the variants are present only in a small fraction of cells. RESULTS: Here, the Brain Somatic Mosaicism Network conducts a coordinated, multi-institutional study to examine the ability of existing methods to detect simulated somatic single-nucleotide variants (SNVs) in DNA mixing experiments, generate multiple replicates of whole-genome sequencing data from the dorsolateral prefrontal cortex, other brain regions, dura mater, and dural fibroblasts of a single neurotypical individual, devise strategies to discover somatic SNVs, and apply various approaches to validate somatic SNVs. These efforts lead to the identification of 43 bona fide somatic SNVs that range in variant allele fractions from ~0.005 to ~0.28. Guided by these results, we devise best practices for calling mosaic SNVs from 250× whole-genome sequencing data in the accessible portion of the human genome that achieve 90% specificity and sensitivity. Finally, we demonstrate that analysis of multiple bulk DNA samples from a single individual allows the reconstruction of early developmental cell lineage trees. CONCLUSIONS: This study provides a unified set of best practices to detect somatic SNVs in non-cancerous tissues. The data and methods are freely available to the scientific community and should serve as a guide to assess the contributions of somatic SNVs to neuropsychiatric diseases.
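To give a feel for why low-fraction mosaic variants are hard to detect, the Python sketch below relates variant allele fraction (VAF) to the number of supporting reads expected at 250× depth. It assumes a plain binomial sampling model with no sequencing error, which is a simplification chosen for this example and not the workflow described in the paper; the function names are hypothetical.

```python
from math import comb


def variant_allele_fraction(alt_reads, total_reads):
    """Fraction of reads at a site that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def prob_at_least(min_alt_reads, depth, vaf):
    """Binomial probability of sampling at least `min_alt_reads`
    variant reads at a site with the given depth and true VAF."""
    below = sum(
        comb(depth, i) * vaf**i * (1 - vaf) ** (depth - i)
        for i in range(min_alt_reads)
    )
    return 1 - below


depth = 250                       # sequencing depth quoted in the abstract
low_vaf, high_vaf = 0.005, 0.28   # VAF range quoted in the abstract

expected_low = depth * low_vaf    # about 1.25 variant reads on average
expected_high = depth * high_vaf  # about 70 variant reads on average

p_seen_once = prob_at_least(1, depth, low_vaf)
p_seen_three_times = prob_at_least(3, depth, low_vaf)
```

Under this simplified model, a variant at the low end of the reported range is expected to contribute only one or two reads per site, so any caller that requires several supporting reads will miss it in most single samples.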

Subjects

Brain/metabolism , Genetic Association Studies , Genetic Variation , Alleles , Chromosome Mapping , Computational Biology/methods , Genetic Association Studies/methods , Genomics/methods , Germ Cells/metabolism , High-Throughput Nucleotide Sequencing , Humans , Organ Specificity/genetics , Polymorphism, Single Nucleotide

12.

Genomic Copy Number Variation Study of Nine Macaca Species Provides New Insights into Their Genetic Divergence, Adaptation, and Biomedical Application.

Li, Jing; Fan, Zhenxin; Shen, Feichen; Pendleton, Amanda L; Song, Yang; Xing, Jinchuan; Yue, Bisong; Kidd, Jeffrey M; Li, Jing.

Genome Biol Evol ; 12(12): 2211-2230, 2020 12 06.

Article in English | MEDLINE | ID: mdl-32970804

ABSTRACT

Copy number variation (CNV) can promote phenotypic diversification and adaptive evolution. However, the genomic architecture of CNVs among Macaca species remains scarcely reported, and the roles of CNVs in adaptation and evolution of macaques have not been well addressed. Here, we identified and characterized 1,479 genome-wide hetero-specific CNVs across nine Macaca species with bioinformatic methods, along with 26 CNV-dense regions and dozens of lineage-specific CNVs. The genes intersecting CNVs were overrepresented in nutritional metabolism, xenobiotics/drug metabolism, and immune-related pathways. Population-level transcriptome data showed that nearly 46% of CNV genes were differentially expressed across populations and also mainly consisted of metabolic and immune-related genes, which implied the role of CNVs in environmental adaptation of Macaca. Several CNVs overlapping drug metabolism genes were verified with genomic quantitative polymerase chain reaction, suggesting that these macaques may have different drug metabolism features. The CNV-dense regions, including 15 first reported here, represent unstable genomic segments in macaques where biological innovation may evolve. Twelve gains and 40 losses specific to the Barbary macaque contain genes with essential roles in energy homeostasis and immunity defense, suggesting a genetic basis for its unique distribution in North Africa. Our study not only elucidated the genetic diversity across Macaca species from the perspective of structural variation but also provided suggestive evidence for the role of CNVs in adaptation and genome evolution. Additionally, our findings provide new insights into the application of diverse macaques to drug study.

Subjects

Adaptation, Biological , Biological Evolution , DNA Copy Number Variations , Gene Duplication , Macaca/genetics , Animals

13.

TypeTE: a tool to genotype mobile element insertions from whole genome resequencing data.

Goubert, Clément; Thomas, Jainy; Payer, Lindsay M; Kidd, Jeffrey M; Feusier, Julie; Watkins, W Scott; Burns, Kathleen H; Jorde, Lynn B; Feschotte, Cédric.

Nucleic Acids Res ; 48(6): e36, 2020 04 06.

Article in English | MEDLINE | ID: mdl-32067044

ABSTRACT

Alu retrotransposons account for more than 10% of the human genome, and insertions of these elements create structural variants segregating in human populations. Such polymorphic Alus are powerful markers to understand population structure, and they represent variants that can greatly impact genome function, including gene expression. Accurate genotyping of Alus and other mobile elements has been challenging. Indeed, we found that Alu genotypes previously called for the 1000 Genomes Project are sometimes erroneous, which poses significant problems for phasing these insertions with other variants that comprise the haplotype. To ameliorate this issue, we introduce a new pipeline - TypeTE - which genotypes Alu insertions from whole-genome sequencing data. Starting from a list of polymorphic Alus, TypeTE identifies the hallmarks (poly-A tail and target site duplication) and orientation of Alu insertions using local re-assembly to reconstruct presence and absence alleles. Genotype likelihoods are then computed after re-mapping sequencing reads to the reconstructed alleles. Using a high-quality set of PCR-based genotyping of >200 loci, we show that TypeTE improves genotype accuracy from 83% to 92% in the 1000 Genomes dataset. TypeTE can be readily adapted to other retrotransposon families and brings a valuable toolbox addition for population genomics.
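The step in which genotype likelihoods are computed from reads re-mapped to the presence and absence alleles can be sketched with a generic diploid model. The Python code below is an assumption-laden illustration, not TypeTE's actual implementation: it treats each read as an independent draw, uses a single hypothetical `error_rate` for reads assigned to the wrong allele, and all function names are invented for this example.

```python
from math import log


def genotype_log_likelihoods(presence_reads, absence_reads, error_rate=0.01):
    """Log-likelihood of 0, 1, or 2 insertion copies given read counts.

    Assumed model: a read supports the presence allele with probability
    equal to the insertion allele fraction, blurred by a symmetric
    mis-assignment rate (hypothetical parameter).
    """
    if not 0 < error_rate < 0.5:
        raise ValueError("error_rate must be between 0 and 0.5")
    logliks = {}
    for copies in (0, 1, 2):
        frac = copies / 2
        p_presence = frac * (1 - error_rate) + (1 - frac) * error_rate
        logliks[copies] = (
            presence_reads * log(p_presence)
            + absence_reads * log(1 - p_presence)
        )
    return logliks


def most_likely_genotype(presence_reads, absence_reads, error_rate=0.01):
    """Number of insertion copies with the highest log-likelihood."""
    logliks = genotype_log_likelihoods(presence_reads, absence_reads, error_rate)
    return max(logliks, key=logliks.get)


# Hypothetical read counts at three loci.
homozygous_absent = most_likely_genotype(0, 20)
heterozygous = most_likely_genotype(10, 9)
homozygous_present = most_likely_genotype(20, 0)
```

Working in log space avoids underflow at high depth, and keeping the error rate strictly positive prevents a single stray read from assigning zero probability to a homozygous genotype.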

Subjects

Interspersed Repetitive Sequences/genetics , Mutagenesis, Insertional/genetics , Software , Whole Genome Sequencing/methods , Databases, Genetic , Gene Frequency/genetics , Genetic Loci , Genetics, Population , Genome, Human , Genotype , Humans

14.

Rapid, Paralog-Sensitive CNV Analysis of 2457 Human Genomes Using QuicK-mer2.

Shen, Feichen; Kidd, Jeffrey M.

Genes (Basel) ; 11(2)2020 01 29.

Article in English | MEDLINE | ID: mdl-32013076

ABSTRACT

Gene duplication is a major mechanism for the evolution of gene novelty, and copy-number variation makes a major contribution to inter-individual genetic diversity. However, most approaches for studying copy-number variation rely upon uniquely mapping reads to a genome reference and are unable to distinguish among duplicated sequences. Specialized approaches to interrogate specific paralogs are comparatively slow and have a high degree of computational complexity, limiting their effective application to emerging population-scale data sets. We present QuicK-mer2, a self-contained, mapping-free approach that enables the rapid construction of paralog-specific copy-number maps from short-read sequence data. This approach is based on the tabulation of unique k-mer sequences from short-read data sets, and is able to analyze a 20X coverage human genome in approximately 20 min. We applied our approach to newly released sequence data from the 1000 Genomes Project, constructed paralog-specific copy-number maps from 2457 unrelated individuals, and uncovered copy-number variation of paralogous genes. We identify nine genes where none of the analyzed samples have a copy number of two, 92 genes where the majority of samples have a copy number other than two, and describe rare copy number variation affecting multiple genes at the APOBEC3 locus.
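The mapping-free idea described here, tabulating k-mers that are unique to a region and converting their depth into a copy number, can be sketched in a few lines of Python. This is a minimal toy and not QuicK-mer2 itself: the function names, the k=4 setting, the control-region normalization, and the reads are all assumptions made for this example, and it omits any depth corrections a real tool would need.

```python
from collections import Counter


def count_kmers(reads, k):
    """Tabulate every k-mer observed across a set of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def mean_depth(counts, region_kmers):
    """Average observed count of the k-mers unique to one region."""
    return sum(counts[kmer] for kmer in region_kmers) / len(region_kmers)


def estimate_copy_number(counts, target_kmers, control_kmers, control_copies=2):
    """Scale target depth by a control region of known copy number."""
    return (
        control_copies
        * mean_depth(counts, target_kmers)
        / mean_depth(counts, control_kmers)
    )


# Hypothetical unique k-mers for a diploid control region and one paralog.
control_kmers = ["AAAC", "AACG"]
target_kmers = ["TTTG", "TTGC"]

# Toy reads: the control sequence is sampled 4 times, the paralog 6 times.
reads = ["AAACG"] * 4 + ["TTTGC"] * 6

counts = count_kmers(reads, k=4)
copy_number = estimate_copy_number(counts, target_kmers, control_kmers)
```

Because only k-mers unique to a given paralog are counted, reads from closely related duplicates do not inflate the estimate, which is the property that makes the approach paralog-specific.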

Subjects

Computational Biology/methods , DNA Copy Number Variations , Sequence Analysis, DNA/methods , Algorithms , Evolution, Molecular , Gene Duplication , Genome, Human , Humans

15.

Identification and characterization of occult human-specific LINE-1 insertions using long-read sequencing technology.

Zhou, Weichen; Emery, Sarah B; Flasch, Diane A; Wang, Yifan; Kwan, Kenneth Y; Kidd, Jeffrey M; Moran, John V; Mills, Ryan E.

Nucleic Acids Res ; 48(3): 1146-1163, 2020 02 20.

Article in English | MEDLINE | ID: mdl-31853540

ABSTRACT

Long Interspersed Element-1 (LINE-1) retrotransposition contributes to inter- and intra-individual genetic variation and occasionally can lead to human genetic disorders. Various strategies have been developed to identify human-specific LINE-1 (L1Hs) insertions from short-read whole genome sequencing (WGS) data; however, they have limitations in detecting insertions in complex repetitive genomic regions. Here, we developed a computational tool (PALMER) and used it to identify 203 non-reference L1Hs insertions in the NA12878 benchmark genome. Using PacBio long-read sequencing data, we identified L1Hs insertions that were absent in previous short-read studies (90/203). Approximately 81% (73/90) of the L1Hs insertions reside within endogenous LINE-1 sequences in the reference assembly and the analysis of unique breakpoint junction sequences revealed 63% (57/90) of these L1Hs insertions could be genotyped in 1000 Genomes Project sequences. Moreover, we observed that amplification biases encountered in single-cell WGS experiments led to a wide variation in L1Hs insertion detection rates between four individual NA12878 cells; under-amplification limited detection to 32% (65/203) of insertions, whereas over-amplification increased false positive calls. In sum, these data indicate that L1Hs insertions are often missed using standard short-read sequencing approaches and long-read sequencing approaches can significantly improve the detection of L1Hs insertions present in individual genomes.

Subjects

Long Interspersed Nucleotide Elements , Sequence Analysis, DNA/methods , Cell Line , Genome, Human , Humans , Polymorphism, Genetic , Single-Cell Analysis , Software , Whole Genome Sequencing

16.

Dog10K: the International Consortium of Canine Genome Sequencing.

Wang, Guo-Dong; Larson, Greger; Kidd, Jeffrey M; vonHoldt, Bridgett M; Ostrander, Elaine A; Zhang, Ya-Ping.

Natl Sci Rev ; 6(4): 611-613, 2019 Jul.

Article in English | MEDLINE | ID: mdl-31598382

17.

Stable integrant-specific differences in bimodal HIV-1 expression patterns revealed by high-throughput analysis.

Read, David F; Atindaana, Edmond; Pyaram, Kalyani; Yang, Feng; Emery, Sarah; Cheong, Anna; Nakama, Katherine R; Burnett, Cleo; Larragoite, Erin T; Battivelli, Emilie; Verdin, Eric; Planelles, Vicente; Chang, Cheong-Hee; Telesnitsky, Alice; Kidd, Jeffrey M.

PLoS Pathog ; 15(10): e1007903, 2019 10.

Article in English | MEDLINE | ID: mdl-31584995

ABSTRACT

HIV-1 gene expression is regulated by host and viral factors that interact with viral motifs and is influenced by proviral integration sites. Here, expression variation among integrants was followed for hundreds of individual proviral clones within polyclonal populations throughout successive rounds of virus and cultured cell replication, with limited findings using CD4+ cells from donor blood consistent with observations in immortalized cells. Tracking clonal behavior by proviral "zip codes" indicated that mutational inactivation during reverse transcription was rare, while clonal expansion and proviral expression states varied widely. By sorting for provirus expression using a GFP reporter in the nef open reading frame, distinct clone-specific variation in on/off proportions was observed that spanned three orders of magnitude. Tracking GFP phenotypes over time revealed that as cells divided, their progeny alternated between HIV transcriptional activity and non-activity. Despite these phenotypic oscillations, the overall GFP+ population within each clone was remarkably stable, with clones maintaining clone-specific equilibrium mixtures of GFP+ and GFP- cells. Integration sites were analyzed for correlations between genomic features and the epigenetic phenomena described here. Integrants inserted in the sense orientation of genes were more frequently found to be GFP negative than those in the antisense orientation, and clones with high GFP+ proportions were more distal to repressive H3K9me3 peaks than low GFP+ clones. Clones with low frequencies of GFP positivity appeared to expand more rapidly than clones for which most cells were GFP+, even though the tested proviruses were Vpr-. Thus, much of the increase in the GFP- population in these polyclonal pools over time reflected differential clonal expansion. Together, these results underscore the temporal and quantitative variability in HIV-1 gene expression among proviral clones that is conferred in the absence of metabolic or cell-type dependent variability, and shed light on cell-intrinsic layers of regulation that affect HIV-1 population dynamics.

Subjects

CD4-Positive T-Lymphocytes/virology , HIV Infections/virology , HIV-1/physiology , Proviruses/genetics , Virus Integration/genetics , Virus Replication , CD4-Positive T-Lymphocytes/metabolism , HIV Infections/genetics , High-Throughput Screening Assays , Humans , Jurkat Cells , Transduction, Genetic

18.

A Neofunctionalized X-Linked Ampliconic Gene Family Is Essential for Male Fertility and Equal Sex Ratio in Mice.

Kruger, Alyssa N; Brogley, Michele A; Huizinga, Jamie L; Kidd, Jeffrey M; de Rooij, Dirk G; Hu, Yueh-Chiang; Mueller, Jacob L.

Curr Biol ; 29(21): 3699-3706.e5, 2019 11 04.

Article in English | MEDLINE | ID: mdl-31630956

ABSTRACT

The mammalian sex chromosomes harbor an abundance of newly acquired ampliconic genes, although their functions require elucidation [1-9]. Here, we demonstrate that the X-linked Slx and Slxl1 ampliconic gene families represent mouse-specific neofunctionalized copies of a meiotic synaptonemal complex protein, Sycp3. In contrast to the meiotic role of Sycp3, CRISPR-loxP-mediated multi-megabase deletions of the Slx (5 Mb) and Slxl1 (2.3 Mb) ampliconic regions result in post-meiotic defects, abnormal sperm, and male infertility. Males carrying Slxl1 deletions sire more male offspring, whereas males carrying Slx and Slxl1 duplications sire more female offspring, which directly correlates with Slxl1 gene dosage and gene expression levels. SLX and SLXL1 proteins interact with spindlin protein family members (SPIN1 and SSTY1/2) and males carrying Slxl1 deletions downregulate a sex chromatin modifier, Scml2, leading us to speculate that Slx and Slxl1 function in chromatin regulation. Our study demonstrates how newly acquired X-linked genes can rapidly evolve new and essential functions and how gene amplification can increase sex chromosome transmission.

Subjects

Fertility/genetics , Genes, X-Linked/genetics , Multigene Family/genetics , Sex Chromosomes/genetics , Sex Ratio , Animals , Female , Gene Dosage , Gene Expression , Male , Mice , Mice, Inbred C57BL , Mice, Inbred DBA

19.

Origin and recent expansion of an endogenous gammaretroviral lineage in domestic and wild canids.

Halo, Julia V; Pendleton, Amanda L; Jarosz, Abigail S; Gifford, Robert J; Day, Malika L; Kidd, Jeffrey M.

Retrovirology ; 16(1): 6, 2019 03 07.

Article in English | MEDLINE | ID: mdl-30845962

ABSTRACT

BACKGROUND: Vertebrate genomes contain a record of retroviruses that invaded the germlines of ancestral hosts and are passed to offspring as endogenous retroviruses (ERVs). ERVs can impact host function since they contain the necessary sequences for expression within the host. Dogs are an important system for the study of disease and evolution, yet no substantiated reports of infectious retroviruses in dogs exist. Here, we utilized Illumina whole genome sequence data to assess the origin and evolution of a recently active gammaretroviral lineage in domestic and wild canids. RESULTS: We identified numerous recently integrated loci of a canid-specific ERV-Fc sublineage within Canis, including 58 insertions that were absent from the reference assembly. Insertions were found throughout the dog genome including within and near gene models. By comparison of orthologous occupied sites, we characterized element prevalence across 332 genomes including all nine extant canid species, revealing evolutionary patterns of ERV-Fc segregation among species as well as subpopulations. CONCLUSIONS: Sequence analysis revealed common disruptive mutations, suggesting a predominant form of ERV-Fc spread by trans complementation of defective proviruses. ERV-Fc activity included multiple circulating variants that infected canid ancestors from the last 20 million to within 1.6 million years, with recent bursts of germline invasion in the sublineage leading to wolves and dogs.

Subjects

Canidae , Endogenous Retroviruses/classification , Endogenous Retroviruses/genetics , Evolution, Molecular , Retroviridae Infections/veterinary , Animals , Computational Biology , High-Throughput Nucleotide Sequencing , Proviruses/classification , Proviruses/genetics , Retroviridae Infections/virology

20.

Correction: A 32 kb Critical Region Excluding Y402H in CFH Mediates Risk for Age-Related Macular Degeneration.

Sivakumaran, Theru A; Igo, Robert P; Kidd, Jeffrey M; Itsara, Andy; Kopplin, Laura J; Chen, Wei; Hagstrom, Stephanie A; Peachey, Neal S; Francis, Peter J; Klein, Michael L; Chew, Emily Y; Ramprasad, Vedam L; Tay, Wan-Ting; Mitchell, Paul; Seielstad, Mark; Stambolian, Dwight E; Edwards, Albert O; Lee, Kristine E; Leontiev, Dmitry V; Jun, Gyungah; Wang, Yang; Tian, Liping; Qiu, Feiyou; Henning, Alice K; LaFramboise, Thomas; Sen, Parveen; Aarthi, Manoharan; George, Ronnie; Raman, Rajiv; Das, Manmath Kumar; Vijaya, Lingam; Kumaramanickavel, Govindasamy; Wong, Tien Y; Swaroop, Anand; Abecasis, Goncalo R; Klein, Ronald; Klein, Barbara E K; Nickerson, Deborah A; Eichler, Evan E; Iyengar, Sudha K.

PLoS One ; 13(12): e0209943, 2018.

Article in English | MEDLINE | ID: mdl-30571798

ABSTRACT

[This corrects the article DOI: 10.1371/journal.pone.0025598.].
